\subsection{Stochastic filtering framework}
Let the unknown state $\psi$ be a realization of a random variable with probability density function (pdf) $\phi(\psi)$. In general, $\psi$ can be a spatio-temporal field defined on a space-time domain. The observations $y$ are also random variables, with conditional density $\phi(y|\psi)$. Let the sequence of states from time $k$ up to time $k'$ be denoted by $\psi_{k:k'}$, and similarly $y_{k:k'}$ for the observations. The maximum a posteriori (MAP) estimate of $\psi$ is the maximizer of the posterior $\phi(\psi|y)$, which can be written in terms of $\phi(y|\psi)$ and $\phi(\psi)$ using Bayes' rule:
\[\phi(\psi_{0:k}|y_{1:k})=\frac{\phi(y_{1:k}|\psi_{0:k})\phi(\psi_{0:k})}{\int \phi(y_{1:k}|\psi_{0:k})\phi(\psi_{0:k})\, d\psi_{0:k}}\]
We make the following assumptions: the state evolution is governed by a Markov process, that is, $\phi(\psi_{k}|\psi_{0:k-1})=\phi(\psi_{k}|\psi_{k-1})$; observations at different times are conditionally independent given the state, that is, $\phi(y_{k},y_{k'}|\psi)=\phi(y_{k}|\psi)\phi(y_{k'}|\psi)$; and each observation is independent of the past states given the current state, that is, $\phi(y_{k}|\psi_{0:k})=\phi(y_{k}|\psi_{k})$.
Using these assumptions, we can write the equation of a \emph{general smoother}:
\begin{eqnarray*}
\phi(\psi_{0:k}|y_{1:k})&=&\frac{\phi(y_{1:k}|\psi_{0:k})\phi(\psi_{0:k})}{\int \phi(y_{1:k}|\psi_{0:k})\phi(\psi_{0:k})\, d\psi_{0:k}}\\
&=&\frac{\phi(y_{1}|\psi_{1})\ldots \phi(y_{k}|\psi_{k}) \phi(\psi_{0})\phi(\psi_{1}|\psi_{0})\ldots \phi(\psi_{k}|\psi_{k-1})}{\int(\ldots)\, d\psi_{0:k}}\\
&=&\frac{\phi(\psi_{0})\phi(\psi_{1}|\psi_{0})\phi(y_{1}|\psi_{1}) \ldots \phi(\psi_{k}|\psi_{k-1})\phi(y_{k}|\psi_{k})}{\int(\ldots)\, d\psi_{0:k}}\\
&=&\frac{\phi(\psi_{0:k-1}|y_{1:k-1})\phi(\psi_{k}|\psi_{k-1})\phi(y_{k}|\psi_{k})}{\int(\ldots)\, d\psi_{0:k}}
\end{eqnarray*}

Similarly, the equation of a \emph{general filter} can be written as 
\begin{eqnarray*}
\phi(\psi_{k}|y_{1:k})&=&\int \phi(\psi_{0:k}|y_{1:k})\, d\psi_{0:k-1}\\
&=&\frac{\phi(y_{k}|\psi_{k})\int\phi(\psi_{0:k-1}|y_{1:k-1})\phi(\psi_{k}|\psi_{k-1})\, d\psi_{0:k-1}}{\int(\ldots)\, d\psi_{k}}\\
&=&\frac{\phi(y_{k}|\psi_{k})\phi(\psi_{k}|y_{1:k-1})}{\int(\ldots)\, d\psi_{k}}
\end{eqnarray*}

In the literature, $\phi(\psi_{k}|\psi_{k-1})$ is referred to as the state-evolution model or system dynamics, $\phi(y_{k}|\psi_{k})$ as the observation model or sensor equation, $\phi(\psi_{k}|y_{1:k-1})$ as the prediction or predictive belief, and $\phi(\psi_{k}|y_{1:k})$ as the posterior belief.
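
To make the recursive structure of the filter explicit, the predictive belief and the normalizing constant can be written out as follows. This is a sketch that uses only the Markov and conditional-independence assumptions stated above; no new symbols are introduced. Marginalizing the states $\psi_{0:k-2}$ out of the smoother density gives the prediction step
\[\phi(\psi_{k}|y_{1:k-1})=\int \phi(\psi_{k}|\psi_{k-1})\,\phi(\psi_{k-1}|y_{1:k-1})\, d\psi_{k-1},\]
and the denominator of the filter equation is the marginal likelihood of the new observation,
\[\phi(y_{k}|y_{1:k-1})=\int \phi(y_{k}|\psi_{k})\,\phi(\psi_{k}|y_{1:k-1})\, d\psi_{k}.\]
The filter therefore alternates between a prediction step, which propagates the previous posterior through the system dynamics, and an update step, which reweights the prediction by the observation model.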

The Kalman filter (Kf) represents these densities by their first and second moments, i.e., mean and covariance. It can be shown that if the initial belief~$\phi(\psi_{0})$ is Gaussian, and the observation model and the system dynamics are both linear in the state with additive Gaussian noise, then the Kf gives the minimum mean square error (MMSE) estimate of the state. When the system dynamics or the observation model is nonlinear, the Kf is in general no longer optimal.
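
As an illustration, the recursion can be sketched for a linear-Gaussian model. The symbols $F$, $H$, $Q$, $R$, $m_{k}$, $P_{k}$ and $K_{k}$ are notation assumed here for the sketch only. Suppose $\psi_{k}=F\psi_{k-1}+w_{k}$ with $w_{k}\sim\mathcal{N}(0,Q)$, and $y_{k}=H\psi_{k}+v_{k}$ with $v_{k}\sim\mathcal{N}(0,R)$, where the noise terms are mutually independent and independent of $\psi_{0}$. If the posterior belief at time $k-1$ is Gaussian with mean $m_{k-1}$ and covariance $P_{k-1}$, the prediction $\phi(\psi_{k}|y_{1:k-1})$ is Gaussian with
\begin{eqnarray*}
m_{k}^{-}&=&F m_{k-1}\\
P_{k}^{-}&=&F P_{k-1} F^{\top}+Q
\end{eqnarray*}
and the posterior belief $\phi(\psi_{k}|y_{1:k})$ is Gaussian with
\begin{eqnarray*}
K_{k}&=&P_{k}^{-}H^{\top}\left(H P_{k}^{-} H^{\top}+R\right)^{-1}\\
m_{k}&=&m_{k}^{-}+K_{k}\left(y_{k}-H m_{k}^{-}\right)\\
P_{k}&=&\left(I-K_{k}H\right)P_{k}^{-}
\end{eqnarray*}
Here $K_{k}$ is usually called the Kalman gain. Under these assumptions the mean and covariance characterize each density exactly, so the two steps above are the prediction and update steps of the general filter written in closed form.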